Abstract

Prediction of various weather quantities is mostly based on deterministic numerical
weather forecasting models. Multiple runs of these models with different initial
conditions result in ensembles of forecasts, which are used to estimate the distribution
of future weather quantities. However, the ensembles are usually under-dispersive and
uncalibrated, so post-processing is required.

In the present work Bayesian Model Averaging (BMA) is applied for calibrating 
ensembles of wind speed forecasts produced by the operational Limited Area Model 
Ensemble Prediction System of the Hungarian Meteorological Service (HMS). 

We describe two possible BMA models for wind speed data of the HMS and show
that BMA post-processing significantly improves the calibration and precision of
forecasts.

Key words: Bayesian Model Averaging, gamma distribution, continuous ranked
probability score.



1 Introduction 

The aim of weather forecasting is to give a good prediction of the future states of the
atmosphere on the basis of present observations and mathematical models describing the
dynamics (physical behaviour) of the atmosphere. These models consist of sets of non-linear
partial differential equations which have only numerical solutions. The problem with
these numerical weather prediction models is that the solutions depend strongly on the
initial conditions, which are never fully accurate. A possible way to address this problem
is to run the model with different initial conditions and produce ensembles of forecasts.
With the help of ensembles one can estimate the distribution of future weather variables,
which leads to probabilistic weather forecasting (Gneiting and Raftery, 2005). The ensemble
prediction method was proposed by Leith (1974) and since its first operational
implementation (Buizza et al., 1993; Toth and Kalnay, 1997) it has become a widely used
technique all over the world. However, although e.g. the ensemble mean gives a better
estimate of a meteorological quantity than most or all of the ensemble members, the ensemble
is usually under-dispersive and hence uncalibrated. This phenomenon has been observed at
several operational ensemble prediction systems; for an overview see e.g. Buizza et al. (2005).



The Bayesian model averaging (BMA) method for post-processing ensembles in order to
calibrate them was introduced by Raftery et al. (2005). The basic idea of BMA is that to
each ensemble member forecast there corresponds a conditional probability density function
(PDF) that can be interpreted as the conditional PDF of the future weather quantity provided
the considered forecast is the best one. The BMA predictive PDF of the future weather
quantity is then the weighted sum of the individual PDFs corresponding to the ensemble
members, and the weights are based on the relative performances of the ensemble members
during a given training period. In Raftery et al. (2005) the BMA method was successfully
applied to obtain 48-hour forecasts of surface temperature and sea level pressure in the
North American Pacific Northwest based on the 5 members of the University of Washington
Mesoscale Ensemble (Grimit and Mass, 2002). These weather quantities can be modeled by
normal distributions, so the predictive PDF is a Gaussian mixture. Later Sloughter et al.
(2007) developed a discrete-continuous BMA model for precipitation forecasting, where the
discrete part corresponds to the event of no precipitation, while the cube root of the
precipitation amount (if it is positive) is modeled by a gamma distribution. In Sloughter
et al. (2010) the BMA method was used for wind speed forecasting, with component PDFs
following gamma distributions. Finally, using the von Mises distribution to model angular
data, Bao et al. (2010) introduced a BMA scheme to predict surface wind direction.

In the present work we apply the BMA method for calibrating ensemble forecasts of
wind speed produced by the operational Limited Area Model Ensemble Prediction System
(LAMEPS) of the Hungarian Meteorological Service (HMS) called ALADIN-HUNEPS
(Hágel, 2010; Horányi et al., 2011). ALADIN-HUNEPS covers a large part of Continental
Europe with a horizontal resolution of 12 km and it is obtained by dynamical downscaling
(by the ALADIN limited area model) of the global ARPEGE based PEARP system of
Météo France (Horányi et al., 2006; Descamps et al., 2009). The ensemble consists of 11
members, 10 initialized from perturbed initial conditions and one control member from the
unperturbed analysis. This construction implies that the ensemble contains groups of
exchangeable forecasts (the ensemble members cannot be distinguished), so for post-processing
one has to use the modification of BMA suggested by Fraley et al. (2010).

Figure 1: Verification rank histogram of the 11-member ALADIN-HUNEPS ensemble.
Period: October 1, 2010 - March 25, 2011.



2 Data 

As mentioned in the Introduction, BMA post-processing of ensemble predictions was
applied to wind speed data obtained from the HMS. The data file contains 11-member
ensembles (10 forecasts started from perturbed initial conditions and one control) of
42-hour forecasts of 10 meter wind speed (given in m/s) for 10 major cities in Hungary
(Miskolc, Szombathely, Győr, Budapest, Debrecen, Nyíregyháza, Nagykanizsa, Pécs,
Kecskemét, Szeged) produced by the ALADIN-HUNEPS system of the HMS, together with the
corresponding validating observations, for the period between October 1, 2010 and
March 25, 2011. The forecasts are initialized at 18 UTC, and the startup speed of the
anemometers measuring the validating observations is 0.1 m/s. The data set is fairly
complete: there are only two days (18.10.2010 and 15.02.2011) when three ensemble members
are missing for all sites and one day (20.11.2010) when no forecasts are available.

Figure 1 shows the verification rank histogram of the raw ensemble, that is, the histogram
of ranks of validating observations with respect to the corresponding ensemble forecasts.
This histogram is far from the desired uniform distribution: in many cases all ensemble
members either underestimate or overestimate the validating observation (the ensemble
range contains the observed wind speed in only 61.21% of the cases). Hence, the ensemble
is under-dispersive and therefore uncalibrated.
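
The rank histogram and the range coverage described above can be sketched in a few lines of Python. This is only an illustration: the function names and the toy data are assumptions, not part of the HMS data set, and ties between an observation and the ensemble members are ignored. For a calibrated ensemble of m members the observation is equally likely to take any of the m + 1 ranks, so it should fall inside the ensemble range in about (m - 1)/(m + 1) of the cases, that is 10/12 or roughly 83.33% for 11 members, compared with the 61.21% found here.

```python
from collections import Counter


def verification_rank(members, observation):
    """Rank of the observation among the ensemble members (1 .. m + 1).

    Rank 1 means the observation lies below every member, rank m + 1 means
    it lies above every member. Ties are not treated specially here.
    """
    return 1 + sum(1 for f in members if f < observation)


def rank_histogram(forecasts, observations):
    """Count how often each rank occurs over all forecast cases."""
    m = len(forecasts[0])
    counts = Counter(verification_rank(f, y) for f, y in zip(forecasts, observations))
    return [counts.get(r, 0) for r in range(1, m + 2)]


def range_coverage(forecasts, observations):
    """Fraction of cases in which the ensemble range contains the observation."""
    hits = sum(1 for f, y in zip(forecasts, observations) if min(f) <= y <= max(f))
    return hits / len(observations)


# Toy data (hypothetical): three cases of a 4-member ensemble, wind speed in m/s.
toy_forecasts = [[2.0, 2.5, 3.0, 3.5], [1.0, 1.2, 1.4, 1.6], [4.0, 4.5, 5.0, 5.5]]
toy_observations = [2.7, 2.0, 3.0]

toy_histogram = rank_histogram(toy_forecasts, toy_observations)
toy_coverage = range_coverage(toy_forecasts, toy_observations)
```

In the toy data only the first observation lies inside the ensemble range; the other two fall above and below all members, which is the pattern that produces the U-shaped histogram of an under-dispersive ensemble.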
3 The model and diagnostics



To obtain a probabilistic forecast of wind speed, the modification of the BMA gamma model
of Sloughter et al. (2010) for ensembles with exchangeable members (Fraley et al., 2010) was
used. The first idea is to have two exchangeable groups: one contains the control, denoted
by $f_c$, the other the 10 ensemble members corresponding to the different perturbed
initial conditions, denoted by $f_{\ell,1}, \ldots, f_{\ell,10}$. In this way we assume
that the probability density function (PDF) of the forecasted wind speed $x$ equals:



$$
p(x \mid f_c, f_{\ell,1}, \ldots, f_{\ell,10}; b_0, b_1, c_0, c_1)
  = \omega\, g(x; f_c, b_0, b_1, c_0, c_1)
  + \frac{1-\omega}{10} \sum_{j=1}^{10} g(x; f_{\ell,j}, b_0, b_1, c_0, c_1),
\tag{3.1}
$$

where $\omega \in [0,1]$, and $g$ is the conditional PDF corresponding to the ensemble
members. As we are working with wind speed data, $g(x; f, b_0, b_1, c_0, c_1)$ is a gamma
PDF with mean $b_0 + b_1 f$ and standard deviation $c_0 + c_1 f$. Here we restrict both the
mean and the standard deviation parameters to be common to all ensemble members, which
reduces the number of parameters and simplifies calculations. The mean parameters $b_0, b_1$
are estimated by linear regression, while the weight $\omega$ and the standard deviation
parameters $c_0, c_1$ are estimated by the maximum likelihood method, using training data
consisting of ensemble members and verifying observations from the preceding $n$ days
(training period). In order to handle the problem that wind speed values under 0.1 m/s are
considered to be zero, the maximum likelihood (ML) method for gamma distributions suggested
by Wilks (1990) is applied, while the maximum of the likelihood function is found with the
help of the EM algorithm (McLachlan and Krishnan, 1997). For more details see Sloughter et al.
(2010); Fraley et al. (2010). Once the estimated parameters for a given day are available,
one can use either the mean or the median of the predictive PDF (3.1) as a point forecast.
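
To make the parametrization concrete, the following Python sketch evaluates the two-group predictive PDF (3.1). It is a minimal illustration under stated assumptions: the function names are hypothetical, the parameters are taken as already estimated, and the handling of wind speeds below the anemometer threshold is left out. A gamma distribution with mean m and standard deviation s has shape m^2/s^2 and scale s^2/m, which is the only conversion needed.

```python
import math


def gamma_pdf(x, mean, sd):
    """Gamma density parametrized by mean and standard deviation.

    Evaluated for x > 0 only; for x <= 0 the function returns 0.
    """
    if x <= 0.0:
        return 0.0
    shape = (mean / sd) ** 2      # k = mean^2 / sd^2
    scale = sd ** 2 / mean        # theta = sd^2 / mean
    log_pdf = ((shape - 1.0) * math.log(x) - x / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


def component_pdf(x, f, b0, b1, c0, c1):
    """g(x; f, b0, b1, c0, c1): gamma PDF with mean b0 + b1*f and sd c0 + c1*f."""
    return gamma_pdf(x, b0 + b1 * f, c0 + c1 * f)


def two_group_pdf(x, f_control, f_exchangeable, omega, b0, b1, c0, c1):
    """Predictive PDF of the two-group model.

    The control member gets weight omega, the remaining weight 1 - omega is
    shared equally by the exchangeable members.
    """
    n = len(f_exchangeable)
    control = omega * component_pdf(x, f_control, b0, b1, c0, c1)
    others = sum(component_pdf(x, f, b0, b1, c0, c1) for f in f_exchangeable)
    return control + (1.0 - omega) / n * others
```

A call such as `two_group_pdf(3.2, 3.0, members, 0.3, 0.5, 0.9, 0.8, 0.1)`, with `members` a list of ten forecasts and made-up parameter values, returns the density at 3.2 m/s. The median point forecast could then be obtained by numerically inverting the corresponding CDF.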

A closer look at the ensemble members reveals some differences in the generation of the
ten exchangeable ensemble members. To obtain them only five perturbations are calculated,
which are then added to (odd numbered members) and subtracted from (even numbered members)
the unperturbed initial conditions (Horányi et al., 2011). Figure 2 shows the plume diagram
of the ensemble forecast of 10 meter wind speed for Debrecen initialized at 18 UTC,
22.10.2010 (solid line: control; dotted line: odd numbered members; dashed line: even
numbered members). This diagram clearly illustrates that the behaviour of the ensemble
member groups $\{f_{\ell,1}, f_{\ell,3}, f_{\ell,5}, f_{\ell,7}, f_{\ell,9}\}$ and
$\{f_{\ell,2}, f_{\ell,4}, f_{\ell,6}, f_{\ell,8}, f_{\ell,10}\}$ really differs.

Figure 2: Plume diagram of ensemble forecast of 10 meter wind speed for Debrecen initialized
at 18 UTC, 22.10.2010.

Therefore, one can also consider a model with three exchangeable groups: control, odd
numbered exchangeable members and even numbered exchangeable members. This idea leads to
the following PDF of the forecasted wind speed $x$:

$$
q(x \mid f_c, f_{\ell,1}, \ldots, f_{\ell,10}; b_0, b_1, c_0, c_1)
  = \omega_c\, g(x; f_c, b_0, b_1, c_0, c_1)
  + \sum_{j=1}^{5} \big( \omega_o\, g(x; f_{\ell,2j-1}, b_0, b_1, c_0, c_1)
  + \omega_e\, g(x; f_{\ell,2j}, b_0, b_1, c_0, c_1) \big),
\tag{3.2}
$$

where for the weights $\omega_c, \omega_o, \omega_e \in [0,1]$ we have
$\omega_c + 5\omega_o + 5\omega_e = 1$, while the PDF $g$ and the parameters
$b_0, b_1, c_0, c_1$ are the same as for model (3.1). Obviously, both the weights
and the parameters can be estimated in the same way as before.
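
As a short consistency check, added here for illustration and using only the definitions of models (3.1) and (3.2): each gamma component integrates to one, so integrating the three-group PDF gives the constraint on the weights, and the two-group model is recovered as the special case in which odd and even members share the same weight.

```latex
\int_0^{\infty} q(x \mid f_c, f_{\ell,1}, \ldots, f_{\ell,10}; b_0, b_1, c_0, c_1)\,\mathrm{d}x
  = \omega_c + \sum_{j=1}^{5} (\omega_o + \omega_e)
  = \omega_c + 5\omega_o + 5\omega_e = 1,
\qquad
\omega_c = \omega,\ \ \omega_o = \omega_e = \frac{1-\omega}{10}
  \;\Longrightarrow\; q = p .
```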

As an illustration we consider the data and forecasts for Debrecen for two different dates,
30.12.2010 and 17.03.2011, for models (3.1) and (3.2). Figures 3a and 3b show the PDFs
of the two groups in model (3.1), the overall PDFs, the median forecasts, the verifying
observations, the first and last deciles and the ensemble members. The same functions and
quantities can be seen on Figures 3c and 3d, where besides the overall PDF we have three
component PDFs and three groups of ensemble members. On 30.12.2010 the spread of the
ensemble members is quite fair and the ensemble range contains the validating observation
(3.2 m/s). In this case the ensemble mean (3.5697 m/s) overestimates it, while the BMA
median forecasts corresponding to the two- and three-group models (3.2876 m/s and
3.2194 m/s, respectively) are pretty close to the true wind speed. A different situation is
illustrated on Figures 3b and 3d, where the spread of the ensemble is even higher, but all
ensemble members underestimate the validating observation (6.1 m/s). Obviously, the same
holds for the ensemble mean (3.2323 m/s) and, due to the bias correction, the BMA median
forecasts corresponding to models (3.1) and (3.2) also give bad results (3.3409 m/s and
3.0849 m/s, respectively).

Figure 3: Ensemble BMA PDFs (overall: thick black line; control: red line; sum of
exchangeable members on (a) and (b): light blue line; on (c) and (d): green (odd members)
and blue (even members) lines), ensemble members (circles with the same colours as the
corresponding PDFs), ensemble BMA median forecasts (vertical black line), verifying
observations (vertical orange line) and the first and last deciles (vertical dashed lines)
for wind speed in Debrecen for models (3.1): (a) 30.12.2010, (b) 17.03.2011; and (3.2):
(c) 30.12.2010, (d) 17.03.2011.

Figure 4: Average widths and coverages of 66.7% and 90% BMA prediction intervals
corresponding to two-group model (3.1) for various training period lengths.

To check the performance of probabilistic forecasts based on models (3.1) and (3.2) and
the corresponding point forecasts, as a reference we use the ensemble mean and the ensemble
median. We compare the mean absolute errors (MAE) and the root mean square errors
(RMSE) of these point forecasts and also the mean continuous ranked probability scores
(CRPS) (Wilks, 2006; Gneiting and Raftery, 2007) and the coverages and average widths of
66.7% and 90% prediction intervals of the BMA predictive probability distributions and of
the raw ensemble. We remark that for MAE and RMSE the optimal point forecasts are the
median and the mean, respectively (Gneiting, 2011; Pinson and Hagedorn, 2011). Further,
given a cumulative distribution function (CDF) $F(y)$ and a real number $x$, the CRPS is
defined as

$$
\mathrm{crps}(F, x) = \int_{-\infty}^{\infty} \big( F(y) - \mathbb{1}_{\{y \geq x\}} \big)^2 \,\mathrm{d}y .
$$



The mean CRPS of a probability forecast is the average of the CRPS values of the predictive
CDFs and corresponding validating observations taken over all locations and time points
considered. For the raw ensemble the empirical CDF of the ensemble replaces the predictive
CDF. The coverage of a $(1-\alpha)100\,\%$, $\alpha \in (0,1)$, prediction interval is the
proportion of validating observations located between the lower and upper $\alpha/2$
quantiles of the predictive distribution. For a calibrated predictive PDF this value should
be around $(1-\alpha)100\,\%$.
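
For the raw ensemble these diagnostics are easy to compute directly. The sketch below is an illustration only, with hypothetical function names. It uses the known identity that the CRPS of an empirical CDF equals the mean absolute difference between members and observation minus half the mean absolute difference between pairs of members, and it takes the prediction intervals as given (lower, upper) pairs rather than deriving them from a fitted BMA model.

```python
def crps_ensemble(members, observation):
    """CRPS of the empirical CDF of an ensemble for one observation.

    Uses the identity CRPS = E|X - y| - 0.5 * E|X - X'|, where X and X' are
    independent draws from the empirical distribution of the members.
    """
    m = len(members)
    term_obs = sum(abs(x - observation) for x in members) / m
    term_spread = sum(abs(x - z) for x in members for z in members) / (2.0 * m * m)
    return term_obs - term_spread


def mean_crps(forecasts, observations):
    """Average CRPS over all forecast cases (locations and time points)."""
    scores = [crps_ensemble(f, y) for f, y in zip(forecasts, observations)]
    return sum(scores) / len(scores)


def coverage_and_width(intervals, observations):
    """Coverage (in %) and average width of prediction intervals.

    `intervals` holds (lower, upper) pairs, e.g. the alpha/2 and 1 - alpha/2
    quantiles of each predictive distribution.
    """
    hits = sum(1 for (lo, hi), y in zip(intervals, observations) if lo <= y <= hi)
    widths = [hi - lo for lo, hi in intervals]
    return 100.0 * hits / len(observations), sum(widths) / len(widths)
```

For a single-member ensemble the CRPS reduces to the absolute error, which is why the mean CRPS can be compared directly with the MAE of a point forecast.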
Figure 5: CRPS of BMA predictive distribution, MAE values of BMA median and RMSE
values of BMA mean forecasts corresponding to two-group model (3.1) for various training
period lengths.



4 Results 



The data analysis below was performed with the help of the ensembleBMA package
of R (Fraley et al., 2009, 2011). As a first step the length of the appropriate training
period was determined, then the performances of the BMA post-processed ensemble forecasts
corresponding to models (3.1) and (3.2) were analyzed.



4.1 Training period 



Following e.g. Raftery et al. (2005), to determine the length of the training
period to be used we compare the MAE values of BMA median forecasts, the RMSE values of
BMA mean forecasts, the CRPS values of BMA predictive distributions and the coverages
and average widths of 90% and 66.7% BMA prediction intervals for training periods of
length 10, 11, ..., 60 calendar days. In order to ensure the comparability of the results we
consider verification results from 02.12.2010 to 25.03.2011 (114 days).
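
The common verification window can be reproduced with simple date arithmetic. The sketch below rests on one assumption that is not stated explicitly in the text: a two-day lag between the end of the training window and the verification day, motivated by the 42-hour lead time of forecasts initialized at 18 UTC. With this lag the longest training period (60 days) gives exactly the window quoted above, and a 28-day training period gives the 146 calendar days used in Section 4.2. The constant and function names are hypothetical.

```python
from datetime import date, timedelta

DATA_START = date(2010, 10, 1)
DATA_END = date(2011, 3, 25)
LAG_DAYS = 2  # assumption: a 42 h forecast started at 18 UTC verifies two days later


def first_verification_day(training_days, data_start=DATA_START, lag_days=LAG_DAYS):
    """First day for which a full training window of the given length exists."""
    return data_start + timedelta(days=training_days + lag_days)


def verification_days(training_days, data_end=DATA_END):
    """Number of calendar days from the first verification day to the data end."""
    return (data_end - first_verification_day(training_days)).days + 1
```

Fixing the window to the one implied by the longest training period means that every training length is scored on the same set of days.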

Consider first the two-group model (3.1). On Figure 4 the average widths and coverages
of 66.7% and 90% BMA prediction intervals are plotted against the length of the training
period. The average widths of the prediction intervals show an increasing trend, so shorter
training periods yield sharper forecasts. The coverages of the 66.7% and 90% prediction
intervals do not increase monotonically either. For short training periods the coverage of
the 66.7% prediction interval oscillates around the correct 66.7%, but for training periods
not shorter than 17 days it stays above this level. The coverage of the 90% prediction
interval stabilizes above the correct 90% for training periods longer than 24 days. Hence,
to have calibrated forecasts, one should choose a training period not shorter than 25 days.

Figure 6: Average widths and coverages of 66.7% and 90% BMA prediction intervals
corresponding to three-group model (3.2) for various training period lengths.

Figure 5 shows CRPS values of the BMA predictive distribution, MAE values of BMA median
forecasts and RMSE values of BMA mean forecasts as functions of the training period length.
CRPS and RMSE both take their minima at 28 days, the corresponding values being 0.7388
and 1.3675, respectively. MAE takes its minimum of 1.0472 at 30 days, while the second
smallest value (1.0476) is obtained with a training period of length 28 days. This means
that for model (3.1) a 28-day training period seems to be reasonable and training periods
longer than 30 days need not be considered.

Similar conclusions can be drawn from Figures 6 and 7 for the three-group model (3.2). In
this case the 66.7% and 90% prediction intervals are slightly narrower than the corresponding
intervals of model (3.1); their coverages stabilize above the correct 66.7% and 90% for
training periods longer than 17 and 24 days, respectively. CRPS and MAE plotted on
Figure 7 both reach their minima of 0.7372 and 1.0452, respectively, at 30 days, while the
values 0.7376 and 1.0456 corresponding to a training period of length 28 days are both the
fourth smallest ones. RMSE takes its minimum of 1.3632 at 27 days, and increases afterwards.
The fourth smallest value (1.3644) again corresponds to 28 days, while the RMSE corresponding
to 30 days is larger (1.3664). Moreover, the 66.7% and 90% prediction intervals
corresponding to 28 days are sharper than the corresponding prediction intervals calculated
using a training period of length 30 days (2.5813 and 4.4340 vs. 2.5831 and 4.4378). Hence,
we suggest the use of a training period of length 28 days for both BMA models.

Figure 7: CRPS of BMA predictive distribution, MAE values of BMA median and RMSE
values of BMA mean forecasts corresponding to three-group model (3.2) for various training
period lengths. (Panels: CRPS of BMA; MAE of BMA Median Forecast; RMSE of BMA Mean
Forecast. Horizontal axis: Days in Training Period.)



4.2 Predictions using BMA post-processing 

According to the results of the previous subsection, to test the performance of BMA
post-processing on the 11-member ALADIN-HUNEPS ensemble we use a training period of 28
calendar days. In this way ensemble members, validating observations and BMA models are
available for 146 calendar days (on 20.11.2010 all ensemble members are missing).

Figure 8: PIT histograms for BMA post-processed forecasts using two-group (3.1) and
three-group (3.2) models.

First we check the calibration of BMA post-processed forecasts with the help of
probability integral transform (PIT) histograms. The PIT is the value of the BMA predictive
cumulative distribution function evaluated at the verifying observation (Fraley et al., 2010).
The closer the histogram is to the uniform distribution, the better the calibration.
Figure 8 displays the PIT histograms corresponding to the two- and three-group BMA models
(3.1) and (3.2). Compared to the verification rank histogram of the raw ensemble (see
Figure 1) one can observe a large improvement with the use of calibration. However, these
PIT histograms are still not perfect, e.g. the Kolmogorov-Smirnov test rejects uniformity at
the 5% level both for the two- and for the three-group model. The corresponding p-values are
0.0222 and 0.0187, respectively, so the PIT of the two-group model is slightly better.
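
A uniformity check of this kind can be sketched as follows. The code is a generic illustration, not the implementation used for the results above (the text does not say which one was used): it computes the Kolmogorov-Smirnov distance between a sample of PIT values and the uniform distribution on [0, 1], and an asymptotic p-value from the Kolmogorov distribution that is only appropriate for large samples. The function names are hypothetical.

```python
import math


def ks_uniform_statistic(pit_values):
    """Kolmogorov-Smirnov distance between the PIT sample and U(0, 1)."""
    u = sorted(pit_values)
    n = len(u)
    d_plus = max((i + 1) / n - v for i, v in enumerate(u))
    d_minus = max(v - i / n for i, v in enumerate(u))
    return max(d_plus, d_minus)


def ks_asymptotic_p_value(d, n, terms=100):
    """Asymptotic p-value P(K > sqrt(n) * d) from the Kolmogorov distribution."""
    t = math.sqrt(n) * d
    if t < 0.2:
        # The alternating series converges slowly here; the p-value is 1
        # to within rounding error for such small arguments.
        return 1.0
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * t) ** 2)
                 for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * series))
```

A p-value below 0.05, as for both BMA models here, indicates that the PIT values deviate from uniformity by more than sampling variation alone would explain, even though the histograms look far better than the raw rank histogram.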

Table 1 gives the coverages and average widths of 66.7% and 90.0% prediction intervals
calculated using models (3.1) and (3.2), and the corresponding measures calculated from the
raw ensembles. In the latter case the ensemble of forecasts corresponding to a given location
and time is considered as a statistical sample. The BMA prediction intervals calculated
from both models are approximately twice as wide as the corresponding intervals of the raw
ensemble. This comes from the small dispersion of the raw ensemble, see the verification
rank histogram of Figure 1. Concerning calibration one can observe that the coverages of
both BMA prediction intervals are rather close to the right coverages, while the coverages
of the prediction intervals calculated from the raw ensemble are quite poor. This also shows
that BMA post-processing highly improves calibration. Further, BMA model (3.2) yields
slightly sharper predictions but there is no big difference between the coverages of the two
BMA models.

                     Coverage (%)                        Average Width
Interval             66.7% interval   90.0% interval     66.7% interval   90.0% interval
Raw ensemble         35.70            55.14              1.4388           2.2001
BMA model (3.1)      65.08            90.34              2.6359           4.5297
BMA model (3.2)      65.36            90.21              2.6153           4.4931

Table 1: Coverage and average width of prediction intervals.

                     Mean CRPS    MAE                    RMSE
                                  median     mean        median     mean
Raw ensemble         0.8599       1.1215     1.1090      1.4634     1.4440
BMA model (3.1)      0.7577       1.0678     1.0763      1.4213     1.4067
BMA model (3.2)      0.7556       1.0643     1.0749      1.4153     1.4018

Table 2: Mean CRPS of probabilistic, MAE and RMSE of deterministic forecasts.

Table 2 gives the verification results of the model fit. Verification measures
of probabilistic forecasts and point forecasts calculated using BMA models (3.1) and (3.2)
are compared to the corresponding measures calculated for the raw ensemble. Examining
these results one can clearly observe the advantage of BMA post-processing, which resulted
in a significant decrease in all verification scores. Further, the BMA median forecasts yield
slightly lower MAE values than the BMA mean forecasts for both models, while in the
case of RMSE values the situation is just the opposite, which is a perfect illustration of
the theoretical results of Gneiting (2011) about the optimality of these point forecasts
with respect to the corresponding verification scores. Finally, model (3.2), distinguishing
three exchangeable groups of ensemble forecasts, slightly outperforms model (3.1).
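
The optimality property behind the MAE and RMSE comparison can be seen on a small example. The sketch below uses a made-up, right-skewed sample standing in for draws from a predictive distribution; the numbers are hypothetical and unrelated to the HMS data. It only illustrates the general fact that the median minimizes the mean absolute error and the mean minimizes the root mean square error.

```python
import math
import statistics


def mae(point_forecast, outcomes):
    """Mean absolute error of a single point forecast against the outcomes."""
    return sum(abs(point_forecast - y) for y in outcomes) / len(outcomes)


def rmse(point_forecast, outcomes):
    """Root mean square error of a single point forecast against the outcomes."""
    return math.sqrt(sum((point_forecast - y) ** 2 for y in outcomes) / len(outcomes))


# Hypothetical right-skewed sample (m/s), as is typical for wind speed.
sample = [1.0, 2.0, 3.0, 4.0, 10.0]
median = statistics.median(sample)
mean = statistics.mean(sample)

mae_of_median, mae_of_mean = mae(median, sample), mae(mean, sample)
rmse_of_median, rmse_of_mean = rmse(median, sample), rmse(mean, sample)
```

Here the median has the smaller MAE and the mean has the smaller RMSE, the same ordering as in Table 2 for both BMA models.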

Figure 9 shows the BMA weights corresponding to models (3.1) and (3.2). Examining the
behaviour of the weight $\omega$ of the control member of the ensemble in the two-group
model (3.1), one can observe that in 84.56% of the cases there is a real mixture of gamma
distributions. The values of $\omega$ which are close to 1 correspond to the continuous time
interval 17.11.2010 - 09.12.2010, when the control member of the ensemble gives much better
forecasts than the ten exchangeable ensemble members. This can clearly be seen from Table 3,
where the MAE and RMSE values of the particular ensemble members are given for the above
mentioned period. On all of these 23 consecutive days $\omega > 0.995$, except on 20.11.2010,
when $\omega = 0.9873$. However, as mentioned earlier, on this particular day all ensemble
forecasts are missing from the data set. The situation is quite different in the case of the
three-group model (3.2), where the weight $\omega_c$ of the control is close to 1 (greater
than 0.98) only on 7 days, so in the remaining cases (95.30%) a real mixture of gamma
distributions is present. Further, observe that there are 55 days (36.91%) when all BMA
weights are positive; the even numbered exchangeable members have nearly zero weights (less
than 0.001) in 45 cases (30.20%) at the beginning of the considered time period, while the
weights of the odd numbered exchangeable members are almost zero in 53 cases (35.57%),
mainly at the end of it.

Figure 9: BMA weights of two-group (3.1) and three-group (3.2) models.

        Control  Exchangeable members
        f_c      f_ℓ,1   f_ℓ,2   f_ℓ,3   f_ℓ,4   f_ℓ,5   f_ℓ,6   f_ℓ,7   f_ℓ,8   f_ℓ,9   f_ℓ,10
MAE     1.32     1.60    1.46    1.52    1.68    1.51    1.49    1.56    1.42    1.41    1.65
RMSE    1.69     2.16    1.86    1.96    2.26    1.92    1.95    2.05    1.89    1.81    2.23

Table 3: MAE and RMSE of the control and exchangeable ensemble forecasts for the period
17.11.2010 - 09.12.2010.

Figure 10: Parameters of two-group model (3.1) and differences in standard deviation
parameters between three- and two-group models.

Finally, on Figure 10 the common bias parameters $b_0, b_1$ of both BMA models investigated
and the standard deviation parameters $c_0, c_1$ of the two-group model (3.1) are plotted,
together with the differences in standard deviation parameters of the three- and two-group
models. The bias parameters are rather stable, the relative standard deviations of $b_0$ and
$b_1$ are 25.44% and 9.97%, respectively. Hence, the BMA mean forecast of a particular day
is mainly determined by the corresponding ensemble forecasts. The standard deviation
parameters show more variability, for $c_0$ and $c_1$ the relative standard deviations are
equal to 23.41% and 41.27% for model (3.1) and 22.64% and 36.73% for model (3.2).
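
The relative standard deviation used above is the standard deviation of the daily parameter estimates divided by their mean. A minimal sketch follows; whether the sample or the population standard deviation was used for the reported figures is not stated, so the sample version here is an assumption, as are the function name and the example values.

```python
import statistics


def relative_sd_percent(values):
    """Sample standard deviation divided by the absolute mean, in percent."""
    return 100.0 * statistics.stdev(values) / abs(statistics.mean(values))


# Hypothetical daily estimates of one parameter.
example_estimates = [0.9, 1.0, 1.1]
example_relative_sd = relative_sd_percent(example_estimates)
```

For the example values the relative standard deviation is 10%, comparable to the stability reported for $b_1$.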



5 Conclusions 

In the present study the BMA ensemble post-processing method is applied to the 11-member
ALADIN-HUNEPS ensemble of the HMS to obtain 42-hour predictions of 10 meter wind
speed. Two different BMA models are investigated: one assumes two groups of exchangeable
members (control and forecasts from perturbed initial conditions), while the other considers
three (control and forecasts from perturbed initial conditions with positive and negative
perturbations). For both models a 28-day training period is suggested. The comparison
of the raw ensemble and of the probabilistic forecasts shows that the mean CRPS values
of BMA post-processed forecasts are considerably lower than the mean CRPS of the raw
ensemble. Further, the MAE and RMSE values of BMA point forecasts (median and mean)
are also lower than the MAEs and RMSEs of the ensemble median and of the ensemble
mean. The BMA forecasts are well calibrated: the coverages of the 66.7% and 90.0%
prediction intervals are very close to the nominal values. The three-group BMA model
slightly outperforms the two-group one and in almost all cases yields a real mixture of gamma
distributions.

In this way one can conclude that BMA post-processing of ensemble forecasts of wind
speed data of the HMS significantly improves the precision and calibration of the forecasts,
so its operational application is worth considering.